<fix>[agent]: add timeout to deploy agent syncJsonPost to prevent queue blocking#4129
<fix>[agent]: add timeout to deploy agent syncJsonPost to prevent queue blocking#4129zstack-robot-1 wants to merge 1 commit into
Conversation
|
Warning Review limit reached
More reviews will be available in 39 minutes and 32 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml) Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
概述本 PR 为 ZStack Agent 初始化流程引入显式超时控制。在 变更Agent 初始化超时控制
估算代码审查工作量🎯 2 (Simple) | ⏱️ ~12 分钟 诗
Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 1 warning)
✅ Passed checks (3 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
test/src/test/java/org/zstack/test/core/rest/TestAgentInitTimeout.java (1)
13-13: ⚡ Quick win测试类命名不符合仓库规范(应以 Test/Case 结尾)
建议将类名改为以
Test或Case结尾(例如AgentInitTimeoutTest),并同步调整文件名,避免后续测试发现规则不一致。♻️ 建议修改
-public class TestAgentInitTimeout { +public class AgentInitTimeoutTest {As per coding guidelines “测试类需要以 Test 或 Case 结尾。”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/src/test/java/org/zstack/test/core/rest/TestAgentInitTimeout.java` at line 13, Rename the test class TestAgentInitTimeout to follow the repository convention (e.g., AgentInitTimeoutTest) and update the corresponding file name accordingly; specifically change the class declaration "public class TestAgentInitTimeout" to "public class AgentInitTimeoutTest" and ensure the test file is renamed to match the new class name so test discovery and naming rules remain consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/src/test/java/org/zstack/test/core/rest/TestAgentInitTimeout.java`:
- Line 13: Rename the test class TestAgentInitTimeout to follow the repository
convention (e.g., AgentInitTimeoutTest) and update the corresponding file name
accordingly; specifically change the class declaration "public class
TestAgentInitTimeout" to "public class AgentInitTimeoutTest" and ensure the test
file is renamed to match the new class name so test discovery and naming rules
remain consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)
Review profile: CHILL
Plan: Pro
Run ID: cfab1fbf-f61e-4eed-9a1c-c281cc42eaed
📒 Files selected for processing (2)
core/src/main/java/org/zstack/core/agent/AgentManagerImpl.javatest/src/test/java/org/zstack/test/core/rest/TestAgentInitTimeout.java
| long start = System.currentTimeMillis(); | ||
| try { | ||
| restf.syncJsonPost(url, "{}", Void.class, TimeUnit.SECONDS, 1); | ||
| Assert.fail("should throw on timeout"); | ||
| } catch (Exception e) { | ||
| long elapsed = System.currentTimeMillis() - start; | ||
| errorDetails = e.getMessage(); | ||
| Assert.assertNotNull(errorDetails); | ||
| Assert.assertTrue("timeout should take at least ~1s, got " + elapsed + "ms", | ||
| elapsed >= 500 || errorDetails.contains("timed out") || errorDetails.contains("Timeout")); | ||
| } |
There was a problem hiding this comment.
该用例可能无法验证“超时参数已生效”的核心行为
当前断言允许“非超时错误”通过(elapsed >= 500 || ...),且 core/src/main/java/org/zstack/core/rest/RESTFacadeImpl.java 在 UNIT_TEST_ON 分支里调用 template.exchange(url, method, req, String.class),并未使用 unit/timeout 参数。这样即使超时参数未生效,此测试也可能通过,存在误报风险。
…ue blocking Replace hardcoded timeout with AGENT_INIT_TIMEOUT constant. Delete self-proving constant test. Fix test to use RFC 5737 TEST-NET-1 unreachable address. Strengthen timing assertion to 900ms threshold. Resolves: ZSTAC-84187 Change-Id: Ie81f20c5d4a7b3f9e2d6c8a1b4f7e3d9c2a5f801
cc9d1e7 to
d46ce9c
Compare
Root Cause
AgentManagerImpl.java:155 — syncJsonPost(url, cmd, Void.class) 三参数无超时。Agent端口不可达时阻塞约300s(全局readTimeout),per-IP deploy-agent队列槽在此期间全部积压。
Fix
加 AGENT_INIT_TIMEOUT=60 + TimeUnit.SECONDS 参数,60s后timeout异常自动触发FlowChain error handler → completion.fail() → chain.next() 释放队列槽。
Files Changed
core/.../agent/AgentManagerImpl.java(+2/-1: 加常量+超时参数)test/.../TestAgentInitTimeout.java(+48: 超时行为验证+常量值检查)Risk: Low
Resolves: ZSTAC-84187
sync from gitlab !10032